18 research outputs found

    Unordered feature tracking made fast and easy

    We present an efficient algorithm to fuse two-view correspondences into multi-view consistent tracks. The proposed method relies on the Union-Find algorithm to solve the fusion problem. It is very simple and has a lower computational complexity than other available methods. Our experiments show that it is faster and computes more tracks
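The fusion step described in this abstract can be illustrated with a minimal sketch. It assumes each feature is identified by a hypothetical `(image_id, feature_id)` pair and each two-view correspondence is a pair of such identifiers. The function name `fuse_tracks` and the conflict filter (dropping tracks that contain two features from the same image) are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict

def fuse_tracks(pairwise_matches):
    """Fuse two-view matches into multi-view tracks with Union-Find.

    pairwise_matches: iterable of ((image_a, feat_a), (image_b, feat_b)).
    Returns a sorted list of tracks, each a sorted list of (image, feature).
    """
    parent, size = {}, {}

    def find(x):
        # Walk up to the root, then compress the path behind us.
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:  # union by size: attach small tree to large
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    for a, b in pairwise_matches:
        for node in (a, b):
            if node not in parent:
                parent[node], size[node] = node, 1
        union(a, b)

    # Group features by the root of their set: one set is one candidate track.
    groups = defaultdict(list)
    for node in parent:
        groups[find(node)].append(node)

    tracks = []
    for nodes in groups.values():
        images = [image for image, _ in nodes]
        # Assumed consistency rule: at most one feature per image in a track.
        if len(set(images)) == len(images):
            tracks.append(sorted(nodes))
    return sorted(tracks)

# Hypothetical matches between three images (0, 1, 2).
matches = [
    ((0, 1), (1, 4)), ((1, 4), (2, 7)),  # chains into a 3-view track
    ((0, 2), (1, 5)),                    # a 2-view track
    ((0, 3), (2, 9)), ((2, 9), (0, 8)),  # conflict: two features in image 0
]
tracks = fuse_tracks(matches)
```

With path compression and union by size, each `find` or `union` runs in near-constant amortized time, so the whole fusion is close to linear in the number of pairwise matches.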

    Python Photogrammetry Toolbox: A free solution for Three-Dimensional Documentation

    The modern techniques of Structure from Motion (SfM) and Image-Based Modelling (IBM) open new perspectives in the field of archaeological documentation, providing a simple and accurate way to record three-dimensional data. In the last edition of the workshop, the presentation "Computer Vision and Structure From Motion, new methodologies in archaeological three-dimensional documentation. An open source approach." showed the advantages of this new methodology (low cost, portability, versatility, ...), but it also identified some problems: the use of the closed-source SIFT feature detector and the need to simplify the workflow. The Python Photogrammetry Toolbox (PPT) software is a possible solution to these problems. It is composed of Python scripts that automate the different steps of the workflow. The entire process is reduced to two commands, calibration and dense reconstruction. The user can run it from a graphical interface or from terminal command

    Positionnement robuste et précis de réseaux d'images (Robust and accurate positioning of image networks)

    Computing a 3D representation of a rigid scene from a collection of pictures is now possible thanks to the progress made by multiple-view stereovision methods, even with a simple camera. The reconstruction process, arising from photogrammetry, consists in integrating information from multiple images taken from different viewpoints in order to identify the relative positions and orientations of the pictures. Once the positions and orientations of the cameras (external calibration) are retrieved, the structure of the scene can be reconstructed. To solve the Structure from Motion (SfM) problem, sequential and global methods have been proposed. By nature, sequential methods tend to accumulate errors. This is observable in camera trajectories that are subject to drift and, when pictures are acquired around an object, in reconstructions where the loops do not close. In contrast, global methods consider the network of cameras as a whole. The configuration of cameras is searched for and optimized so as to best preserve all the cycle constraints of the network. Reconstructions of better quality can be obtained, but at the expense of computation time. This thesis aims at analyzing critical issues at the heart of these external calibration methods and at providing solutions to improve their performance (accuracy, robustness and speed) and their ease of use (restricted parametrization). We first propose a fast and efficient feature tracking algorithm. We then show that the widespread use of a contrario robust estimation of parametric models frees the user from choosing detection thresholds and yields a reconstruction pipeline that automatically adapts to the data. In a second step, we use this adaptive robust estimation and a series of convex optimizations to build a scalable global calibration chain. Our experiments show that the a contrario based estimations significantly improve the quality of the estimated picture positions and orientations, while being automatic and parameter-free, even on complex camera networks. Finally, we propose to improve the visual appearance of the reconstruction by providing a convex optimization that ensures color consistency between images.

    Estimation robuste de modèle a contrario, impact sur la précision en structure from motion (A contrario robust model estimation: impact on accuracy in structure from motion)

    Model estimation consists in identifying a model among noisy data. This problem is not trivial, and the state of the art offers many solutions to it. Most often, max-consensus or RANSAC solutions are used. These draw several candidate solutions at random and keep the one with the largest cardinality for a precision given a priori. Despite their simplicity, these solutions have a major drawback: a data acceptance threshold T must be specified. The question then arises of how to choose this parameter. Choosing too large a threshold leads to an overestimation of the valid data, so that noisy data are introduced into the model, whereas choosing too small a threshold leads to an underestimation and an imprecise model. We propose to discuss the AC-RANSAC solution for Structure from Motion and its impact on the accuracy of the estimated camera positions
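The sensitivity to the acceptance threshold T discussed above can be seen on a toy example. The sketch below is only an illustration under stated assumptions: the model is a known 2D line y = 2x + 1, the data and the three threshold values are invented, and `count_inliers` is a hypothetical helper rather than part of any RANSAC library.

```python
def count_inliers(points, slope, intercept, threshold):
    """Count points whose vertical residual to the line is within threshold."""
    return sum(1 for x, y in points
               if abs(y - (slope * x + intercept)) <= threshold)

# Eight valid points on y = 2x + 1, each with a small fixed perturbation.
noise = [0.05, -0.1, 0.2, -0.3, 0.15, -0.25, 0.1, -0.05]
valid = [(x, 2 * x + 1 + n) for x, n in enumerate(noise)]

# Four corrupted points far from the line (residuals 3, -5, 3.5, -5).
corrupted = [(1, 6.0), (3, 2.0), (5, 14.5), (6, 8.0)]
points = valid + corrupted

# Inlier counts for a threshold that is too small, adequate, and too large.
counts = {T: count_inliers(points, 2.0, 1.0, T) for T in (0.12, 0.5, 4.0)}
for T, n in counts.items():
    print(f"T = {T}: {n} of {len(points)} points accepted")
```

On this data the smallest threshold rejects half of the valid points, the middle one accepts exactly the eight valid points, and the largest one also admits two corrupted points.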

    Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations

    Can conversational videos captured from multiple egocentric viewpoints reveal the map of a scene in a cost-efficient way? We seek to answer this question by proposing a new problem: efficiently building the map of a previously unseen 3D environment by exploiting shared information in the egocentric audio-visual observations of participants in a natural conversation. Our hypothesis is that as multiple people ("egos") move in a scene and talk among themselves, they receive rich audio-visual cues that can help uncover the unseen areas of the scene. Given the high cost of continuously processing egocentric visual streams, we further explore how to actively coordinate the sampling of visual information, so as to minimize redundancy and reduce power use. To that end, we present an audio-visual deep reinforcement learning approach that works with our shared scene mapper to selectively turn on the camera to efficiently chart out the space. We evaluate the approach using a state-of-the-art audio-visual simulator for 3D scenes as well as real-world video. Our model outperforms previous state-of-the-art mapping methods, and achieves an excellent cost-accuracy tradeoff. Project: http://vision.cs.utexas.edu/projects/chat2map. Comment: Accepted to CVPR 202

    Robust and accurate calibration of camera networks

    Computing a 3D representation of a rigid scene from a collection of pictures is now possible thanks to the progress made by multiple-view stereovision methods, even with a simple camera. The reconstruction process, arising from photogrammetry, consists in integrating information from multiple images taken from different viewpoints in order to identify the relative positions and orientations of the pictures. Once the positions and orientations of the cameras (external calibration) are retrieved, the structure of the scene can be reconstructed. To solve the Structure from Motion (SfM) problem, sequential and global methods have been proposed. By nature, sequential methods tend to accumulate errors. This is observable in camera trajectories that are subject to drift and, when pictures are acquired around an object, in reconstructions where the loops do not close. In contrast, global methods consider the network of cameras as a whole. The configuration of cameras is searched for and optimized so as to best preserve all the cycle constraints of the network. Reconstructions of better quality can be obtained, but at the expense of computation time. This thesis aims at analyzing critical issues at the heart of these external calibration methods and at providing solutions to improve their performance (accuracy, robustness and speed) and their ease of use (restricted parametrization). We first propose a fast and efficient feature tracking algorithm. We then show that the widespread use of a contrario robust estimation of parametric models frees the user from choosing detection thresholds and yields a reconstruction pipeline that automatically adapts to the data. In a second step, we use this adaptive robust estimation and a series of convex optimizations to build a scalable global calibration chain. Our experiments show that the a contrario based estimations significantly improve the quality of the estimated picture positions and orientations, while being automatic and parameter-free, even on complex camera networks. Finally, we propose to improve the visual appearance of the reconstruction by providing a convex optimization that ensures color consistency between images.

    Automatic Homographic Registration of a Pair of Images, with A Contrario Elimination of Outliers

    The RANSAC algorithm (RANdom SAmple Consensus) is a robust method to estimate the parameters of a model fitting the data in the presence of outliers. Its random nature is due only to complexity considerations. It iteratively extracts from the data a random sample of the minimal size sufficient to estimate the parameters. At each such trial, the number of inliers (data that fit the model within an acceptable error threshold) is counted. In the end, the set of parameters maximizing the number of inliers is accepted. The variant proposed by Moisan and Stival introduces an a contrario criterion to avoid hard thresholds for inlier/outlier discrimination. This has three consequences: (1) the threshold for inlier/outlier discrimination is adaptive and does not need to be fixed; (2) the method gives a decision on the adequacy of the final model, so it does not return a wrong set of parameters when it lacks sufficient confidence; (3) the procedure to draw a new sample can be amended as soon as one set of parameters is deemed meaningful, since the new sample can then be drawn among the inliers of this model. In this particular instantiation, we apply it to the estimation of the homography registering two images of the same scene. The homography is an 8-parameter model arising in two situations when using a pinhole camera: the scene is planar (a painting, a facade, etc.) or the viewpoint location is fixed (pure rotation around the optical center). When the homography is found, it is used to stitch the images in the coordinate frame of the second image and build a panorama. The point correspondences between images are computed by the SIFT algorithm
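The classical RANSAC loop summarized at the start of this abstract (minimal random sample, inlier count, keep the best model) can be sketched as follows. To stay short and self-contained, the sketch fits a 2D line, whose minimal sample is two points, instead of a homography, and it uses a fixed threshold. It is therefore the standard algorithm and not the a contrario variant, and the function names, data, and parameter values are illustrative assumptions.

```python
import random

def fit_line(p, q):
    """Line through two points as (a, b, c), a*x + b*y + c = 0, unit normal."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0:  # degenerate sample: both points coincide
        return None
    return a / norm, b / norm, -(a * x1 + b * y1) / norm

def ransac_line(points, threshold, trials=200, seed=0):
    """Classical RANSAC with a fixed inlier threshold (not a contrario)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    best_model, best_inliers = None, []
    for _ in range(trials):
        # 1. Draw a minimal sample and estimate the model from it.
        model = fit_line(*rng.sample(points, 2))
        if model is None:
            continue
        a, b, c = model
        # 2. Count the data within the error threshold of this model.
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) <= threshold]
        # 3. Keep the model that maximizes the number of inliers.
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Ten points exactly on y = 2x + 1, plus four gross outliers.
data = [(float(x), 2.0 * x + 1.0) for x in range(10)]
data += [(1.0, 9.0), (4.0, -3.0), (7.0, 2.0), (8.0, 25.0)]
model, inliers = ransac_line(data, threshold=0.1)
```

The `threshold` argument is exactly the hard inlier/outlier threshold that the a contrario criterion is meant to replace with an adaptive one.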

    Global Fusion of Relative Motions for Robust, Accurate and Scalable Structure from Motion

    Multi-view structure from motion (SfM) estimates the position and orientation of pictures in a common 3D coordinate frame. When views are treated incrementally, this external calibration can be subject to drift, contrary to global methods that distribute residual errors evenly. We propose a new global calibration approach based on the fusion of relative motions between image pairs. We improve an existing method for robustly computing global rotations. We present an efficient a contrario trifocal tensor estimation method, from which stable and precise translation directions can be extracted. We also define an efficient translation registration method that recovers accurate camera positions. These components are combined into an original SfM pipeline. Our experiments show that, on most datasets, it outperforms in accuracy other existing incremental and global pipelines. It also achieves strikingly good running times: it is about 20 times faster than the other global method we could compare to, and as fast as the best incremental method. More importantly, it features better scalability properties